A Survey of Distance Metrics for Nominal Attributes
Abstract
Many distance-based algorithms, such as k-nearest neighbor learning and locally weighted learning, depend on a good distance metric to succeed. A key problem in such algorithms is how to measure the distance between each pair of instances. In this paper, we survey distance metrics for nominal attributes, including some basic distance metrics and their improvements based on attribute weighting and attribute selection. Experimental results on all 36 UCI datasets published on the main web site of the Weka platform validate their effectiveness.
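To make the idea of a distance between nominal instances concrete, here is a minimal sketch in Python. The abstract does not name the metrics it surveys, so the choice of the overlap metric (with optional attribute weights) and the Value Difference Metric (VDM) is an assumption, as are all function names and the toy data.

```python
from collections import Counter, defaultdict


def overlap_distance(x, y, weights=None):
    """Sum the weights of the attributes on which x and y disagree.

    With weights=None every attribute counts 1.0 (plain overlap metric).
    A weight of 0.0 drops an attribute, which mimics attribute selection.
    """
    if weights is None:
        weights = [1.0] * len(x)
    return sum(w for a, b, w in zip(x, y, weights) if a != b)


def fit_vdm(rows, labels):
    """Count class labels per (attribute index, attribute value) pair."""
    classes = sorted(set(labels))
    counts = [defaultdict(Counter) for _ in rows[0]]
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            counts[i][value][label] += 1
    return classes, counts


def vdm_distance(x, y, model, q=1):
    """Sum over attributes and classes of |P(c | a) - P(c | b)| ** q."""
    classes, counts = model
    total = 0.0
    for i, (a, b) in enumerate(zip(x, y)):
        ca = counts[i].get(a, Counter())  # unseen values get empty counts
        cb = counts[i].get(b, Counter())
        na, nb = sum(ca.values()), sum(cb.values())
        for c in classes:
            pa = ca[c] / na if na else 0.0
            pb = cb[c] / nb if nb else 0.0
            total += abs(pa - pb) ** q
    return total


# Hypothetical toy data: (color, size) -> class label
rows = [("red", "small"), ("red", "large"), ("blue", "small"), ("blue", "large")]
labels = ["yes", "yes", "no", "no"]
model = fit_vdm(rows, labels)

print(overlap_distance(("red", "small"), ("blue", "large")))
print(overlap_distance(("red", "small"), ("blue", "large"), weights=[2.0, 0.5]))
print(vdm_distance(("red", "small"), ("blue", "large"), model))
```

In this toy data only the color attribute separates the classes, so VDM assigns the size attribute no distance at all, while the unweighted overlap metric counts both attributes equally. Attribute weighting is one way to close that gap.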
Related resources
Improved Heterogeneous Distance Functions
Instance-based learning techniques typically handle continuous and linear input values well, but often do not handle nominal input attributes appropriately. The Value Difference Metric (VDM) was designed to find reasonable distance values between nominal attribute values, but it largely ignores continuous attributes, requiring discretization to map continuous values into nominal values. This pa...
Probabilistic Distance Measures for Prototype-based Rules
Probabilistic distance functions, including several variants of value difference metrics, the minimum risk metric and Short-Fukunaga metrics, are used with prototype-based rules (P-rules) to provide a very concise and comprehensible classification model. Application of probabilistic metrics to nominal or discrete features is straightforward. Heterogeneous metrics that handle continuous attributes wi...
Dissimilarity learning for nominal data
Defining a good distance (dissimilarity) measure between patterns is of crucial importance in many classification and clustering algorithms. While a lot of work has been performed on continuous attributes, nominal attributes are more difficult to handle. A popular approach is to use the value difference metric (VDM) to define a real-valued distance measure on nominal values. However, VDM treats...
A Framework for Optimal Attribute Evaluation and Selection in Hesitant Fuzzy Environment Based on Enhanced Ordered Weighted Entropy Approach for Medical Dataset
Background: In this paper, a generic hesitant fuzzy set (HFS) model for clustering various ECG beats according to weights of attributes is proposed. A comprehensive review of electrocardiogram signal classification and segmentation methodologies indicates that algorithms that can effectively handle the nonstationarity and uncertainty of the signals should be used for ECG analysis. Ex...
Interval MULTIMOORA method with target values of attributes based on interval distance and preference degree: biomaterials selection
A target-based MADM method covers beneficial and non-beneficial attributes as well as target values for some attributes. Such techniques are considered comprehensive forms of MADM approaches. Target-based MADM methods can also be used in traditional decision-making problems in which only beneficial and non-beneficial attributes exist. In many practical selection problems, some attributes ha...
Journal: JSW
Volume: 5
Publication date: 2010